synthetic language
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Austria > Vienna (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
M-CIF: Multi-Scale Alignment For CIF-Based Non-Autoregressive ASR
Mao, Ruixiang, Ma, Xiangnan, Yang, Qing, Zhu, Ziming, Qiao, Yucheng, Ge, Yuan, Xiao, Tong, Gao, Shengxiang, Yu, Zhengtao, Zhu, Jingbo
The Continuous Integrate-and-Fire (CIF) mechanism provides effective alignment for non-autoregressive (NAR) speech recognition: it creates a smooth, monotonic mapping from acoustic features to target tokens and achieves performance on Mandarin competitive with other NAR approaches. However, without finer-grained guidance, its stability degrades in languages such as English and French. In this paper, we propose Multi-scale CIF (M-CIF), which performs multi-level alignment by progressively distilling character- and phoneme-level supervision into subword representations, thereby making acoustic-text alignment more robust. Experiments show that M-CIF reduces WER compared to the Paraformer baseline, most notably on CommonVoice, by 4.21% in German and 3.05% in French. To further investigate these gains, we define phonetic confusion errors (PE) and space-related segmentation errors (SE) as evaluation metrics. Analysis of these metrics across different M-CIF settings reveals that the phoneme and character layers are essential for enhancing progressive CIF alignment.
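The core CIF firing rule described above can be sketched in a few lines: per-frame weights are accumulated, and an integrated vector is emitted each time the running sum crosses a threshold. This is a minimal illustration, not the paper's implementation; in practice the weights come from a learned scalar head and the tail/scaling is handled during training.

```python
def cif_fire(frames, alphas, threshold=1.0):
    """Accumulate per-frame weights `alphas`; emit a weighted sum of
    frames each time the running total crosses `threshold`.
    Any trailing partial accumulation is dropped in this sketch."""
    out = []
    acc = 0.0                          # accumulated weight
    integ = [0.0] * len(frames[0])     # integrated feature vector
    for frame, a in zip(frames, alphas):
        if acc + a < threshold:
            acc += a
            integ = [v + a * f for v, f in zip(integ, frame)]
        else:
            used = threshold - acc     # portion of this frame closing the token
            out.append([v + used * f for v, f in zip(integ, frame)])
            rem = a - used             # remainder starts the next token
            acc = rem
            integ = [rem * f for f in frame]
    return out
```

With uniform weights of 0.5 over four frames, exactly two token vectors fire, each integrating one unit of weight, which is the smooth, monotonic feature-to-token mapping the abstract refers to.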
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
Searching for the Most Human-like Emergent Language
Boldt, Brendon, Mortensen, David
In this paper, we design a signalling game-based emergent communication environment to generate state-of-the-art emergent languages in terms of similarity to human language. This is done with hyperparameter optimization, using XferBench as the objective function. XferBench quantifies the statistical similarity of emergent language to human language by measuring its suitability for deep transfer learning to human language. Additionally, we demonstrate the predictive power of entropy on the transfer learning performance of emergent language as well as corroborate previous results on the entropy-minimization properties of emergent communication systems. Finally, we report generalizations regarding what hyperparameters produce more realistic emergent languages, that is, ones which transfer better to human language.
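The signalling-game setup underlying this environment can be sketched with tabular agents and simple reinforcement: a sender maps a state to a message, a receiver maps the message to a guess, and both raise the propensity of the entries they used when the guess is correct. This is an illustrative toy, not the paper's environment or learning rule.

```python
import random

def lewis_game(num_states=4, num_msgs=4, episodes=3000, lr=0.1, seed=0):
    """Tabular Lewis signalling game trained by simple reinforcement.
    Returns the greedy communication accuracy after training."""
    rng = random.Random(seed)
    S = [[1.0] * num_msgs for _ in range(num_states)]    # sender propensities
    R = [[1.0] * num_states for _ in range(num_msgs)]    # receiver propensities

    def sample(weights):
        return rng.choices(range(len(weights)), weights=weights)[0]

    for _ in range(episodes):
        s = rng.randrange(num_states)   # nature draws a state
        m = sample(S[s])                # sender emits a message
        g = sample(R[m])                # receiver guesses the state
        if g == s:                      # on success, reinforce the used entries
            S[s][m] += lr
            R[m][g] += lr

    # greedy evaluation: argmax message, then argmax guess
    correct = 0
    for s in range(num_states):
        m = max(range(num_msgs), key=lambda j: S[s][j])
        g = max(range(num_states), key=lambda j: R[m][j])
        correct += (g == s)
    return correct / num_states
```

Depending on the seed, such dynamics can settle into pooling equilibria rather than a perfect signalling system, which is part of why hyperparameter search over the environment matters.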
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Transfer of Structural Knowledge from Synthetic Languages
Budnikov, Mikhail, Yamshchikov, Ivan
This work explores transfer learning from several synthetic languages to English. We investigate the structure of the embeddings in the fine-tuned models, the information they contain, and the capabilities of the fine-tuned models on simple linguistic tasks. We also introduce a new synthetic language that leads to better transfer to English than the languages used in previous research. Finally, we introduce the Tiny-Cloze Benchmark, a new synthetic benchmark for natural language understanding that is more informative for less powerful models. We use the Tiny-Cloze Benchmark to evaluate fine-tuned models in several domains, demonstrating that fine-tuning on a new synthetic language allows for better performance on a variety of tasks.
- Europe > Germany > Bremen > Bremen (0.04)
- Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
- Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Morphological Typology in BPE Subword Productivity and Language Modeling
This study investigates the impact of morphological typology on tokenization and language modeling performance. We focus on languages with synthetic and analytic morphological structures and examine their subword productivity when tokenized with the byte-pair encoding (BPE) algorithm. We compare the performance of models trained with similar amounts of data in different languages. Our experiments reveal that languages with synthetic features exhibit greater subword regularity and productivity under BPE tokenization and achieve better results in language modeling tasks. We also observe that the typological continuum from linguistic theory is reflected in several experiments. These findings suggest a correlation between morphological typology and BPE tokenization efficiency.
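The BPE procedure the study relies on is straightforward to sketch: represent each word as a sequence of symbols plus an end-of-word marker, then repeatedly merge the most frequent adjacent symbol pair. This is the standard algorithm in a minimal form, not the study's tokenizer; the toy corpus below is illustrative.

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Learn BPE merges from a list of words.
    Each step merges the most frequent adjacent symbol pair."""
    vocab = Counter(tuple(word) + ('</w>',) for word in corpus)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])   # apply the merge
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges
```

On a corpus dominated by the shared stem "low" ("low", "lower", "lowest"), the first merges assemble that stem, which is the kind of subword regularity the study measures across morphological types.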
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Saxony > Leipzig (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
The Model Is The Message
An odd controversy appeared in the news cycle last month when a Google engineer, Blake Lemoine, was placed on leave after publicly releasing transcripts of conversations with LaMDA, a chatbot based on a Large Language Model (LLM) that he claims is conscious, sentient and a person. Like most other observers, we do not conclude that LaMDA is conscious in the ways that Lemoine believes it to be. His inference is clearly based on motivated anthropomorphic projection. At the same time, it is also possible that these kinds of artificial intelligence (AI) are "intelligent" -- and even "conscious" in some way -- depending on how those terms are defined. Still, neither of these terms can be very useful if they are defined in strongly anthropocentric ways. An AI may also be one and not the other, and it may be useful to distinguish sentience from both intelligence and consciousness. For example, an AI may be genuinely intelligent in some way but only sentient in the restrictive sense of sensing and acting deliberately on external information. Perhaps the real lesson for philosophy of AI is that reality has outpaced the available language to parse what is already at hand. A more precise vocabulary is essential.
- Health & Medicine (1.00)
- Information Technology > Services (0.34)
Grounding Spatio-Temporal Language with Transformers
Karch, Tristan, Teodorescu, Laetitia, Hofmann, Katja, Moulin-Frier, Clément, Oudeyer, Pierre-Yves
Language is an interface to the outside world. In order for embodied agents to use it, language must be grounded in other, sensorimotor modalities. While there is an extended literature studying how machines can learn grounded language, the topic of how to learn spatio-temporal linguistic concepts is still largely uncharted. To make progress in this direction, we introduce a novel spatio-temporal language grounding task where the goal is to learn the meaning of spatio-temporal descriptions of behavioral traces of an embodied agent. This is achieved by training a truth function that predicts whether a description matches a given history of observations. The descriptions involve time-extended predicates in past and present tense as well as spatio-temporal references to objects in the scene. To study the role of architectural biases in this task, we train several models including multimodal Transformer architectures; the latter implement different attention computations between words and objects across space and time. We test models on two classes of generalization: 1) generalization to randomly held-out sentences; 2) generalization to grammar primitives. We observe that maintaining object identity in the attention computation of our Transformers is instrumental to achieving good performance on generalization overall, and that summarizing object traces in a single token has little influence on performance. We then discuss how this opens new perspectives for language-guided autonomous embodied agents. We also release our code under an open-source license, as well as pretrained models and datasets, to encourage the wider community to build upon and extend our work in the future.
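The truth-function task described above pairs a description with a trace of observations and asks for a match/no-match label. A hand-coded evaluator over a toy grammar shows the shape of such labels; the object names, relation, and tuple format here are all illustrative, not the paper's dataset.

```python
def eval_description(trace, description):
    """Evaluate a toy spatio-temporal description over a trace.
    `trace` is a list of snapshots mapping object name -> (x, y).
    `description` is (subject, tense, relation, object), e.g.
    ('ball', 'was', 'left_of', 'box'). 'is' checks the last snapshot
    (present tense); 'was' checks any earlier snapshot (past tense)."""
    subj, tense, rel, obj = description

    def holds(snap):
        if rel == 'left_of':
            return snap[subj][0] < snap[obj][0]
        return snap[subj][0] > snap[obj][0]   # 'right_of'

    if tense == 'is':
        return holds(trace[-1])
    return any(holds(s) for s in trace[:-1])
```

A learned truth function replaces this hand-coded check with a model that consumes the word sequence and the observation history, which is where the different word-object attention schemes compared in the paper come in.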
Researchers say we need better benchmarks to build more useful AI assistants
The promise of conversational AI is that, unlike virtually any other form of technology, all you have to do is talk. Natural language is the most natural and democratic form of communication. After all, humans are born capable of learning how to speak, but some never learn to read or use a graphical user interface. That's why AI researchers from Element AI, Stanford University, and CIFAR recommend that academic researchers take steps to create more useful forms of AI that speak with people to get things done, including the elimination of existing benchmarks. "As many current [language user interface] benchmarks suffer from low ecological validity, we recommend researchers not to initiate incremental research projects on them. Benchmark-specific advances are less meaningful when it is unclear if they transfer to real LUI use cases. Instead, we suggest the community to focus on conceptual research ideas that can generalize well beyond the current datasets," the paper reads.